Outlier Detection with Globally Optimal Exemplar-Based GMM
Authors
Abstract
Outlier detection has recently become an important problem in many data mining applications. In this paper, a novel unsupervised algorithm for outlier detection is proposed. First, we apply a provably globally optimal Expectation Maximization (EM) algorithm to fit a Gaussian Mixture Model (GMM) to a given data set. In our approach, a Gaussian is centered at each data point, so the estimated mixture proportions can be interpreted as the probabilities that each data point is a cluster center. The outlier factor at each data point is then defined as a weighted sum of the mixture proportions, with weights representing the similarities to other data points. The proposed outlier factor is thus based on global properties of the data set, in contrast to most existing approaches to outlier detection, which are strictly local. Our experiments on several simulated and real-life data sets demonstrate superior performance of the proposed approach. Moreover, we demonstrate its ability to detect unusual shapes.
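The procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the isotropic kernel exp(-beta * squared distance), the shared width parameter `beta`, the fixed iteration count `n_iter`, the reuse of the same kernel as the similarity weight, and the function name `exemplar_gmm_outlier_scores` are all choices made here for illustration. Because every Gaussian is fixed at a data point, only the mixture proportions are updated, which is what makes the fitting problem convex.

```python
import numpy as np

def exemplar_gmm_outlier_scores(X, beta=1.0, n_iter=200):
    """Sketch of an exemplar-based GMM outlier score (assumptions noted below).

    Assumptions (not taken from the abstract): isotropic kernels with a shared
    width `beta`, a fixed number of EM iterations, and similarity weights equal
    to the same kernel values.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]

    # Pairwise squared Euclidean distances between all data points.
    diff = X[:, None, :] - X[None, :, :]
    d2 = np.sum(diff ** 2, axis=-1)

    # K[i, j]: unnormalized Gaussian centered at x_j, evaluated at x_i.
    K = np.exp(-beta * d2)

    # Start from uniform mixture proportions, one component per data point.
    q = np.full(n, 1.0 / n)

    for _ in range(n_iter):
        # mix[i] = sum_j q_j * K[i, j], the mixture value at x_i.
        mix = K @ q
        # EM update of the proportions only; component centers stay fixed.
        q = q * (K.T @ (1.0 / mix)) / n

    # Outlier factor: similarity-weighted sum of the mixture proportions.
    scores = K @ q
    return q, scores
```

Under this sketch's convention, a low score marks a point that is dissimilar to every data point carrying substantial mixture weight, so the lowest-scoring points are the outlier candidates. The choice of `beta` strongly affects the result and would need tuning for real data.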
Similar resources
Improving Exemplar-based Image Completion methods using Selecting the Optimal Patch
Image completion is a topic in image and video processing that deals with restoring and filling in damaged regions of images using undamaged regions. Exemplar-based image completion methods give more pleasing results than pixel-based approaches. In this paper, a new algorithm is proposed to find the most suitable patch for filling in the damaged parts. This patch selection ...
Identification of outliers types in multivariate time series using genetic algorithm
Multivariate time series data are often modeled using a vector autoregressive moving average (VARMA) model. However, the presence of outliers can violate the stationarity assumption and may lead to incorrect modeling, biased parameter estimates, and inaccurate predictions. Thus, detection of these points and how to deal properly with them, especially in relation to modeling and parameter estimation of VARMA m...
Outlier Detection Using Extreme Learning Machines Based on Quantum Fuzzy C-Means
One of a data miner's most important concerns is having accurate, error-free data: data that contains no human errors and whose records are complete and correct. In this paper, a new learning model based on an extreme learning machine neural network is proposed for outlier detection. The function of neural networks depends on various parameters such as the structur...
Effective Image and Video Error Concealment using RST-Invariant Partial Patch Matching Model and Exemplar-based Inpainting
An effective visual error concealment method has been presented by employing a robust rotation, scale, and translation (RST) invariant partial patch matching model (RSTI-PPMM) and exemplar-based inpainting. While the proposed robust and inherently feature-enhanced texture synthesis approach ensures the generation of excellent and perceptually plausible visual error concealment results, the outl...
Optimal Feature Based Density Clustering for Outlier Detection in Multivariate Data
Efficient outlier detection in a large-scale big data environment involves considerable complexity in processing the information and handling it proficiently. In segregating outliers from normal data items, many prevailing methodologies experience complexity arising from the features involved in every single attribute. On recognizing appropriate features associated the cha...